12 - Compute Tree Canopy
library(knitr)
library(rmdformats)
## Global options
options(max.print="75")
opts_chunk$set(echo=TRUE,
cache=TRUE,
prompt=FALSE,
tidy=TRUE,
comment=NA,
message=FALSE,
warning=FALSE)
opts_knit$set(width=75)Task 1: Compute tree density by neighborhood (I)
In order to compute tree density by neighborhood you need two things. You will need to know the area of the neighborhoods, which you will compute in the next exercise. And you need the tree counts by neighborhood which is the focus of this exercise.
You will produce counts of all trees by neighborhood in NYC and create a single data frame with a column for total trees. The result should be a data frame with no geometry.
Instructions
- Load
sf,raster, anddplyrin your workspace. - Load
trees,neighborhoods,canopy, andmanhattanobjects from Week 02 data folder - Use the
count()function fromdplyron thetreesobject to compute total trees by neighborhood. The neighborhood variable ishood. Call the new objecttree_counts. - Use
head()to take a quick look attree_countsand see if the geometry variable still exists. - Remove the geometry from your
tree_countsdata frame. - The default for the
count()function is to name the count field n. Use therename()function from dplyr to change the column calledntotree_cnt. - Use
hist()to create a quick histogram of the tree counts.
# Load data and libraries
library(sf, raster)
library(dplyr)
library(ggplot2)
# Read in the trees shapefile
trees <- st_read("../../cds-spatial/Week02/data/trees.shp")Reading layer `trees' from data source `/home/cds-au617011/cds-spatial/Week02/data/trees.shp' using driver `ESRI Shapefile'
Simple feature collection with 65217 features and 7 fields
geometry type: POINT
dimension: XY
bbox: xmin: -74.2546 ymin: 40.49894 xmax: -73.70078 ymax: 40.91165
CRS: 4326
# Read in the neighborhood shapefile
neighborhoods <- st_read("../../cds-spatial/Week02/data//neighborhoods.shp")Reading layer `neighborhoods' from data source `/home/cds-au617011/cds-spatial/Week02/data/neighborhoods.shp' using driver `ESRI Shapefile'
Simple feature collection with 195 features and 5 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -74.25559 ymin: 40.49612 xmax: -73.70001 ymax: 40.91553
CRS: 4326
# Read in the tree canopy single-band raster
canopy <- raster::raster("../data/canopy.tif")
# Read in the manhattan Landsat image multi-band raster
manhattan <- raster::brick("../data/manhattan.tif")
# Compute the counts of all trees by hood
tree_counts <- dplyr::count(trees, boroname)
# Take a quick look
head(tree_counts)Simple feature collection with 5 features and 2 fields
geometry type: MULTIPOINT
dimension: XY
bbox: xmin: -74.2546 ymin: 40.49894 xmax: -73.70078 ymax: 40.91165
CRS: 4326
boroname n geometry
1 Bronx 7956 MULTIPOINT ((-73.93187 40.8...
2 Brooklyn 17000 MULTIPOINT ((-74.04088 40.6...
3 Manhattan 6312 MULTIPOINT ((-74.01803 40.7...
4 Queens 23765 MULTIPOINT ((-73.95897 40.7...
5 Staten Island 10184 MULTIPOINT ((-74.2546 40.50...
# Remove the geometry
tree_counts_no_geom <- st_drop_geometry(tree_counts)
# Rename the n variable to tree_cnt
tree_counts_renamed <- rename(tree_counts_no_geom, tree_cnt = n)
# Create histograms of the total counts
ggplot(tree_counts_renamed, aes(x = boroname, y = tree_cnt)) + geom_col() + theme_minimal() +
ggtitle("Total counts of trees by hood") + xlab("Hood") + ylab("Tree count")Great, you’ve completed the first step toward computing tree densities. You now have tree counts by neighborhoods and in the next exercise you’ll compute the neighborhood areas.
Task 2: Compute tree density by neighborhood (II)
We have the tree counts (from the previous exercise). In this exercise you will compute neighborhood areas, add them to the neighborhood sf object and then you’ll join in the non-spatial tree counts data frame from the previous exercise.
Instructions
- Compute the areas of the neighborhoods with
st_area()and wrap them in theunclass()function to remove the units. Save this object asareas. - Use
mutate()to add the areas to your neighborhoods object. Call the new variablearea. left_join()theneighborhoodsobject (which should now have areas) and thetree_countsobject that you calculated in the last exercise.- Use
mutate()with the givenifelse()code to replaceNAs with 0. - Create a new column in
neighborhoodscalledtree_densitywhich istree_cnt/area.
# Compute areas and unclass
areas <- unclass(st_area(neighborhoods))
# Add the areas to the neighborhoods object
neighborhoods_area <- mutate(neighborhoods, area = areas)
# Join neighborhoods and counts by shared columns (find the matching one in
# neighborhoods)
neighborhoods_counts <- left_join(neighborhoods_area, tree_counts_renamed, by = c(boro_name = "boroname"))
# Replace NA values with 0 (lookup ifelse() function if needed)
neighborhoods_counts <- mutate(neighborhoods_counts, tree_cnt = ifelse(is.na(tree_cnt),
0, tree_cnt))
# Compute the density
neighborhoods_counts <- mutate(neighborhoods_counts, tree_density = tree_cnt/area)You’re part way there. You have now computed the tree density variable using, in part, the sf function st_area(). In the next exercises you will compute the average tree canopy values and then compare.
Task 3: Compute average tree canopy by neighborhood
In the previous exercises you computed tree density by neighborhood using tree counts. In this exercise you will compute average tree canopy by neighborhood as a percentage so that we can compare if the results are similar.
Instructions
- Use
head()to confirm that yourneighborhoodsobject has the density results from the previous exercises. - Transform the CRS of the
neighborhoodsobject to match thecanopyobject (load the latter if needed). - Convert the
neighborhoodsobject to anspobject for use with therasterpackage. - Compute the mean of
canopyvalues by neighborhood usingraster::extract()with the argument fun = mean (this might take a minute). - Use
mutate() to add a new columnavg_canopyto neighborhoods, which is equal tocanopy_neighborhoods.
Simple feature collection with 6 features and 8 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -74.00736 ymin: 40.61264 xmax: -73.77574 ymax: 40.8355
CRS: 4326
county_fip boro_name ntacode ntaname boro_code area tree_cnt
1 047 Brooklyn BK88 Borough Park 3 5017229 17000
2 081 Queens QN52 East Flushing 4 2736433 23765
3 081 Queens QN48 Auburndale 4 3173995 23765
4 081 Queens QN51 Murray Hill 4 4876380 23765
5 081 Queens QN27 East Elmhurst 4 1832715 23765
6 005 Bronx BX35 Morrisania-Melrose 2 1569317 7956
geometry tree_density
1 MULTIPOLYGON (((-73.97605 4... 0.003388325
2 MULTIPOLYGON (((-73.79493 4... 0.008684665
3 MULTIPOLYGON (((-73.77574 4... 0.007487409
4 MULTIPOLYGON (((-73.80379 4... 0.004873492
5 MULTIPOLYGON (((-73.8611 40... 0.012967100
6 MULTIPOLYGON (((-73.89697 4... 0.005069720
# Transform the neighborhoods CRS to match the canopy layer
neighborhoods_crs <- sf::st_transform(neighborhoods_counts, crs = raster::crs(canopy,
asText = TRUE))
# Convert neighborhoods object to a Spatial object (optional)
neighborhoods_sp <- as(neighborhoods_crs, "Spatial")
# Compute the mean of canopy values by neighborhood
canopy_neighborhoods <- raster::extract(canopy, neighborhoods_sp, fun = mean)
# Add the mean canopy values to neighborhoods
neighborhoods_avg_canopy <- mutate(neighborhoods_counts, avg_canopy = canopy_neighborhoods)Excellent! Note that you transformed the neighborhoods object’s CRS. This is actually not strictly necessary because extract() can transform CRS on the fly. But it will be needed for plotting and other operations later so doing manually is important here.
Task 4: Create plots using ggplot2
As an initial review of the data you created you will compute correlations and create histograms and a scatter plot using ggplot2.
Instructions
- Load the
ggplot2package. - Create a histogram of total tree density using
ggplot()withgeom_histogram(). The object isneighborhoods_avg_canopyand the variable istree_density. - Create a histogram of average canopy (
avg_canopy) usingggplot()withgeom_histogram(). - Create a scatter plot of
tree_densityon the x-axis againstavg_canopyon the y-axis usingggplot()andgeom_point(). Also add a smooth usingstat_smooth(). - Compute the correlation between tree density and average canopy with
cor()by running the given code.
# Load the ggplot2 package
library(ggplot2)
# Create a histogram of tree density (tree_density)
ggplot(neighborhoods_avg_canopy, aes(x = tree_density)) + geom_histogram(color = "white")# Create a histogram of average canopy (avg_canopy)
ggplot(neighborhoods_avg_canopy, aes(x = avg_canopy)) + geom_histogram(color = "white")# Create a scatter plot of tree_density vs avg_canopy
ggplot(neighborhoods_avg_canopy, aes(x = tree_density, y = avg_canopy)) + geom_point() +
stat_smooth()# Compute the correlation between density and canopy
cor(neighborhoods_avg_canopy$tree_density, neighborhoods_avg_canopy$avg_canopy) [,1]
[1,] -0.3564911
Nice! Ggplot2 is great for creating data graphics and in the next exercise you’ll see that you can use ggplot2 to make maps. In this case the scatter plot and correlation suggest an unexpected result. If anything, the street tree density data and tree canopy data are negatively correlated. You will confirm this with maps in the next tasks and you will learn why.
Task 5: Create a map using ggplot2
The geom_sf() function operates like any other layer in ggplot2 where you can link variables to aesthetics on the plot through the aes() function. In a mapping context this might mean, for example, creating a choropleth map by color coding the polygons based on a variable. If you leave off the aesthetic mapping geom_sf() will map the geometry alone.
devtools::install_github("tidyverse/ggplot2")
Instructions
- Ensure that
ggplot2is loaded in your workspace and the CRS for both objects match (transform if needed. Note that there is no EPSG code, just a proj4string, for this CRS and this is fine.) - Use
ggplot()andgeom_sf()to create a map oftree_densityusing the default colors. - Use
ggplot()andgeom_sf()to create a map ofavg_canopyusing the default colors. - Alter the map of
tree_densityby changing the colors withscale_fill_gradient(). The hex codes (a#followed by numbers) for shades of green have been provided. - Alter the map of
avg_canopyby changing the colors withscale_fill_gradient(). The hex codes (a#followed by numbers) for shades of green have been provided.
# Simplify name
neighborhoods <- neighborhoods_avg_canopy
# Plot the tree density with default colors
ggplot(neighborhoods) + geom_sf(aes(fill = tree_density))# Plot the tree density using scale_fill_gradient()
ggplot(neighborhoods) + geom_sf(aes(fill = tree_density)) + scale_fill_gradient(low = "#edf8e9",
high = "#005a32")# Plot the tree canopy using the scale_fill_gradient()
ggplot(neighborhoods) + geom_sf(aes(fill = avg_canopy)) + scale_fill_gradient(low = "#edf8e9",
high = "#005a32")Great! You’re making progress in improving the graphics. The new layer type geom_sf() is a big help for creating maps in ggplot2. Altering the colors made the maps much more readable and you probably noticed that they seem to show a different pattern. How about doing this with tmap? See the next exercise.
Task 6: Create a map using tmap
The tmap package is an excellent tool for making maps. You’ll see that it simplifies the process of binning your data values for nice maps and makes it easy to stitch together different layers.
tmap operates similarly to ggplot2 in the sense that it starts with a function, in this case tm_shape(), to set up the map and then is followed by layers and other instructions separated with the +.
Instructions
- Load
tmaplibrary in your workspace. - Use
tm_shape()andtm_polygons()to create a map ofneighborhoodswith no color-coding. - Use
tm_shape()andtm_polygons()to create a map ofneighborhoodsand color code bytree_density. - Create a new map by updating your neighborhoods map to include
palette = "Greens"intm_polygons(). - Create a similar map of the variable
avg_canopyto include the following arguments intm_polygons():style = "quantile"n = 7.
# Load tmap library
library(tmap)
# Create a simple map of neighborhoods
tm_shape(neighborhoods) + tm_polygons()# Create a color-coded map of neighborhood tree density
tm_shape(neighborhoods) + tm_polygons(col = "tree_density")# Style the tree density map
tm_shape(neighborhoods) + tm_polygons("tree_density", palette = "Greens", style = "quantile",
n = 7, title = "Trees per sq. KM")# Create a similar map of average tree canopy
tm_shape(neighborhoods) + tm_polygons("avg_canopy", palette = "Greens", style = "quantile",
n = 7, title = "Average tree canopy (%)")You can see the beauty of tmap. It makes nice maps in R relatively easily and provides a lot of flexibility to alter your plot as needed
Task 7: Use tmap to create a final prettier(?) map
In this exercise you will use tmap to create a final map with three map views to help you to compare the green you see in an aerial photo with the tree density and average canopy results you computed by neighborhood. The question you’re trying to answer is which measure, tree density or average tree canopy, more closely matches what you can see in the aerial photo.
Note: In map2, you will make use of the bbox argument to force the view extents to match the aerial photo.
Instructions 1/4
- Remember to load
tmap
- Create and view a map of the
manhattanaerial image withtm_shape()andtm_raster().
- Create and view a map of the
neighboorhoodsborders withtm_shape()andtm_borders().
# Create a map of the neighborhood polygons
tm_shape(neighborhoods) + tm_borders(col = "black", lwd = 0.5, alpha = 0.5)- Combine the
manhattanandneighborhoodsmaps above with a+and save this asmap1(theneighborhoodslayer/map should be second). - Create the second map of tree measures with
tm_shape()andtm_polygons(). - Combine the two maps with
tmap_arrange(), useasp = NAto map height/width to match the bounding box.
library(tmap)
# Your code goes below
map1 <- tm_shape(manhattan) + tm_rgb() + tm_shape(neighborhoods) + tm_borders(col = "black",
lwd = 0.5, alpha = 0.5)
map2 <- tm_shape(neighborhoods) + tm_polygons("tree_density", palette = "Greens",
style = "quantile", n = 7, title = "Trees per sq. KM")
tmap_arrange(map1, map2, asp = NA)Congratulations! You’ve created a nice final map you can share with colleagues to show how average tree canopy is a better representation of green space than the tree density based on the Street Tree Census data from NYC.